time zone
The U.S. tried permanent daylight saving time--and hated it
In 1974, America set its clocks forward for good in the name of energy savings. Between January and September in 1974, President Richard Nixon made daylight saving time permanent for a brief period. As fall approaches, so too does the end of daylight saving time (DST). On November 2nd, the hour between 1 a.m. and 2 a.m. will happen twice.
- Europe > Germany (0.05)
- Europe > United Kingdom (0.05)
- North America > United States > Alaska (0.05)
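The repeated hour mentioned in the summary above can be made concrete with a small sketch. It assumes the year 2025 and the `America/New_York` zone, neither of which is stated in the article, and it relies on Python's standard `zoneinfo` module (which needs time zone data available on the system). The `fold` attribute picks which of the two occurrences of 1:30 a.m. is meant.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# Assumption: a US zone that observes DST; the article names no specific zone.
tz = ZoneInfo("America/New_York")

# Assumption: the year is 2025, when November 2nd is the first Sunday of the month.
# The wall-clock time 1:30 a.m. occurs twice that night.
first = datetime(2025, 11, 2, 1, 30, tzinfo=tz, fold=0)   # before clocks fall back
second = datetime(2025, 11, 2, 1, 30, tzinfo=tz, fold=1)  # after clocks fall back

# Same wall-clock reading, different UTC offsets.
print(first.utcoffset())   # offset during daylight saving time
print(second.utcoffset())  # offset during standard time

# Convert to UTC before subtracting; subtracting two datetimes that share
# the same tzinfo object compares wall-clock values and would give zero.
gap = second.astimezone(timezone.utc) - first.astimezone(timezone.utc)
print(gap)
```

Storing timestamps in UTC and converting to local time only for display sidesteps this ambiguity.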
OptimalThinkingBench: Evaluating Over and Underthinking in LLMs
Aggarwal, Pranjal, Kim, Seungone, Lanchantin, Jack, Welleck, Sean, Weston, Jason, Kulikov, Ilia, Saha, Swarnadeep
Thinking LLMs solve complex tasks at the expense of increased compute and overthinking on simpler problems, while non-thinking LLMs are faster and cheaper but underthink on harder reasoning problems. This has led to the development of separate thinking and non-thinking LLM variants, leaving the onus of selecting the optimal model for each query on the end user. We introduce OptimalThinkingBench, a unified benchmark that jointly evaluates overthinking and underthinking in LLMs and also encourages the development of optimally-thinking models that balance performance and efficiency. Our benchmark comprises two sub-benchmarks: OverthinkingBench, featuring simple math and general queries in 72 domains, and UnderthinkingBench, containing 11 challenging reasoning tasks along with harder math problems. Using novel thinking-adjusted accuracy metrics, we extensively evaluate 33 different thinking and non-thinking models and show that no model is able to optimally think on our benchmark. Thinking models often overthink for hundreds of tokens on the simplest user queries without improving performance. In contrast, large non-thinking models underthink, often falling short of much smaller thinking models. We further explore several methods to encourage optimal thinking, but find that these approaches often improve on one sub-benchmark at the expense of the other, highlighting the need for better unified and optimal models in the future.
- Europe (0.68)
- Asia > Russia > Siberian Federal District (0.28)
- Asia > Russia > Far Eastern Federal District (0.28)
- Leisure & Entertainment (1.00)
- Health & Medicine (1.00)
- Media > Music (0.94)
- Education (0.68)
Around the World in 24 Hours: Probing LLM Knowledge of Time and Place
Holtermann, Carolin, Röttger, Paul, Lauscher, Anne
Reasoning over time and space is essential for understanding our world. However, the abilities of language models in this area are largely unexplored as previous work has tested their abilities for logical reasoning in terms of time and space in isolation or only in simple or artificial environments. In this paper, we present the first evaluation of the ability of language models to jointly reason over time and space. To enable our analysis, we create GeoTemp, a dataset of 320k prompts covering 289 cities in 217 countries and 37 time zones. Using GeoTemp, we evaluate eight open chat models of three different model families for different combinations of temporal and geographic knowledge. We find that most models perform well on reasoning tasks involving only temporal knowledge and that overall performance improves with scale. However, performance remains constrained in tasks that require connecting temporal and geographical information. We do not find clear correlations of performance with specific geographic regions. Instead, we find a significant performance increase for location names with low model perplexity, suggesting their repeated occurrence during model training. We further demonstrate that their performance is heavily influenced by prompt formulation - a direct injection of geographical knowledge leads to performance gains, whereas, surprisingly, techniques like chain-of-thought prompting decrease performance on simpler tasks.
- Oceania (1.00)
- North America (1.00)
- Asia > Middle East > Iran (0.14)
The Morning After: NASA has to make a time zone for the Moon
The White House has published a policy memo asking NASA to create a new time standard for the Moon by 2026. Coordinated Lunar Time (LTC) will establish an official time reference to help guide future lunar missions. The US, China, Japan, India and Russia have space missions to the Moon planned or completed. The European Space Agency is also trying to make a time zone outside of Earth's… zone. Given the Moon's weaker gravity, time moves slightly faster there. "The same clock we have on Earth would move at a different rate on the Moon," NASA space communications and navigation chief Kevin Coggins told Reuters.
- North America > United States (1.00)
- Europe > Russia (0.26)
- Asia > Russia (0.26)
- Government > Space Agency (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
What Time Is It on the Moon?
In 2025, astronauts will begin returning to the moon, eventually building bases and space stations, putting robotic landers and rovers to work, and mining for resources. In this bustling new era of lunar activity, they'll need to synchronize with each other. But so far there is no agreed-upon time system or zones, and there's neither GPS nor internet on the moon. Setting those up will require developing new technologies on Earth to be deployed 239,000 miles away. Javier Ventura-Traveset, an engineer at the European Space Agency, is leading this work with a project called Moonlight, which aims to design satellites for astronauts and robotic explorers.
- North America > United States (0.19)
- North America > Canada (0.06)
- Europe (0.06)
Data Engineer, Measurement and Attribution at Block - Toronto, ON, Canada
As a Data Engineer working on Measurement, you will be tasked with maintaining and developing critical data solutions used to understand conversion attribution and demystify holistic, cross-channel performance. You will be the technical owner of our multi-touch, cross-channel conversion attribution model and pipelines. In this role, you will directly contribute to Square's growth, and closely inform investment within go-to-market channels across Sales, Marketing, SEO, etc. You will be uniquely positioned at the center of our attribution data, helping product, platform, and channel teams alike interpret performance and develop strategy. We want employees to be able to reside where they feel most creative and productive.
Using the profile of publishers to predict barriers across news articles
Sittar, Abdul, Mladenic, Dunja
Detection of news propagation barriers, whether economic, cultural, political, time-zone, or geographical, is still an open research issue. We present an approach to barrier detection in news spreading that uses Wikipedia concepts and metadata associated with each barrier. Solving this problem not only conveys information about the coverage of an event but also shows whether an event has been able to cross a specific barrier. Experimental results on the IPoNews dataset (a dataset for information spreading over the news) reveal that simple classification models are able to detect barriers with high accuracy. We believe that our approach can provide useful insights that pave the way for the future development of a system for predicting information spreading barriers over the news.
Walk a Mile in Their Shoes
Jenna Butler is an adjunct professor in the radiation therapy department at Bellevue College, Bellevue, WA, and a senior applied research scientist at Microsoft Research, Redmond, WA, USA. Catherine Yeh is a senior at Williams College, Williamstown, MA, USA, where she studies computer science and cognitive science.
- North America > United States > Washington > King County > Redmond (0.24)
- North America > United States > Washington > King County > Bellevue (0.24)
- Asia > Middle East > Republic of Türkiye (0.04)
- Asia > Japan (0.04)
- Health & Medicine > Consumer Health (1.00)
- Law (0.94)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.47)
Senior Backend Engineer - Machine Learning Platform at Spotify
We are looking for a Senior Backend Engineer to help us define and build the next generation of model inference within the Machine Learning Platform at Spotify. Our team's mission is to enable Spotifiers to scale their model inference – real-time, batch, or on-device. We are calculating predictions at the scale of 430M monthly active users. You will advance high-volume, real-time inference and large-scale prediction logging for future training. ML allows us to solve problems at scale, growing our impact faster than we grow our resources.
- Media > Music (0.80)
- Leisure & Entertainment (0.80)
- Information Technology (0.76)
What Is the Unix Epoch, and How Does Unix Time Work?
And that means Linux does too. We explain this seemingly odd system, and why doomsday was scheduled for 2038. Goethe (1749-1832) declared "Every second is of infinite value." That's true, we each only have so many seconds here on planet Earth, and we don't know when our last second will be. But we do know our birthday, and when our mortal countdown started.
- Information Technology > Software (0.58)
- Information Technology > Artificial Intelligence (0.48)
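As a rough illustration of the Unix time entry above: Unix time counts seconds elapsed since 00:00:00 UTC on January 1, 1970. The sketch below assumes the counter is stored as a signed 32-bit integer, which is the case usually meant by the 2038 problem; the helper names `from_unix`, `to_unix`, and `wrap_int32` are hypothetical and exist only for this example.

```python
from datetime import datetime, timedelta, timezone

# The Unix epoch: the instant that Unix time counts from.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Assumption: the counter is a signed 32-bit integer.
INT32_MAX = 2**31 - 1


def from_unix(seconds: int) -> datetime:
    """Convert a count of seconds since the epoch to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


def to_unix(moment: datetime) -> int:
    """Convert a timezone-aware datetime to whole seconds since the epoch."""
    return int((moment - EPOCH).total_seconds())


def wrap_int32(value: int) -> int:
    """Simulate signed 32-bit overflow (two's complement wraparound)."""
    return (value + 2**31) % 2**32 - 2**31


# The last second a signed 32-bit counter can represent.
last = from_unix(INT32_MAX)
print(last.isoformat())  # 2038-01-19T03:14:07+00:00

# One second later the counter wraps to the most negative value.
wrapped = from_unix(wrap_int32(INT32_MAX + 1))
print(wrapped.isoformat())  # 1901-12-13T20:45:52+00:00
```

Systems that store the counter in 64 bits do not hit this limit in any practical timeframe.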